AITopics | ai-generated data

ai-generated data

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question "What is Artificial Intelligence?" and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Understanding nature and nurture: Statistical and AI innovations uncover how genes and environment shape human health

Science | Nov-6-2025, 14:01:00 GMT

What makes us who we are? Is it our DNA, passed down through generations, or the environment that shapes our lives? This question--how nature and nurture combine to influence health and behavior--has long captured my curiosity. As I grew up in a multigenerational household, I was struck by the story of my two uncles, identical twins who were genetically indistinguishable but who lived out very different health journeys. One developed severe cardiovascular disease by his early forties; the other stayed healthy into his sixties. What separated them was not biology--it was environment.

ai-generated data, environment shape human health science, nature and nurture, (8 more...)

Science

Genre: Research Report > Experimental Study (0.31)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.55)
Health & Medicine > Consumer Health (0.50)

Technology: Information Technology > Artificial Intelligence (1.00)

Psittacines of Innovation? Assessing the True Novelty of AI Creations

Mukherjee, Anirban

arXiv.org Artificial Intelligence | Mar-17-2024

We examine whether Artificial Intelligence (AI) systems generate truly novel ideas rather than merely regurgitating patterns learned during training. Utilizing a novel experimental design, we task an AI with generating project titles for hypothetical crowdfunding campaigns. We compare within AI-generated project titles, measuring repetition and complexity. We compare between the AI-generated titles and actual observed field data using an extension of maximum mean discrepancy--a metric derived from the application of kernel mean embeddings of statistical distributions to high-dimensional machine learning (large language) embedding vectors--yielding a structured analysis of AI output novelty. Results suggest that (1) the AI generates unique content even under increasing task complexity, and at the limits of its computational capabilities, (2) the generated content has face validity, being consistent with both inputs to other generative AI and in qualitative comparison to field data, and (3) exhibits divergence from field data, mitigating concerns relating to intellectual property rights. We discuss implications for copyright and trademark law.
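The abstract above compares generated titles with field data through maximum mean discrepancy (MMD) over embedding vectors. As a rough illustration only, the sketch below computes the standard biased MMD estimate with a Gaussian kernel; it is not the paper's extension of MMD. The helper `toy_embeddings`, the bandwidth, and the random vectors are hypothetical stand-ins for real language-model embeddings of project titles.

```python
import math
import random


def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth=1.0):
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(rbf_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    return (
        mean_kernel(xs, xs, bandwidth)
        + mean_kernel(ys, ys, bandwidth)
        - 2.0 * mean_kernel(xs, ys, bandwidth)
    )


def toy_embeddings(n, dim, shift, rng):
    """Hypothetical stand-in for embedding vectors: Gaussian noise around `shift`."""
    return [[rng.gauss(shift, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)  # fixed seed so the run is repeatable
generated = toy_embeddings(50, 4, 0.0, rng)      # assumed "AI-generated" sample
field_similar = toy_embeddings(50, 4, 0.0, rng)  # assumed field data, same distribution
field_shifted = toy_embeddings(50, 4, 1.5, rng)  # assumed field data, shifted distribution

mmd_same = mmd_squared(generated, field_similar, bandwidth=2.0)
mmd_diff = mmd_squared(generated, field_shifted, bandwidth=2.0)
print(f"similar distributions: {mmd_same:.4f}")
print(f"shifted distributions: {mmd_diff:.4f}")
```

A larger value indicates that the two samples are easier to tell apart under the chosen kernel. In practice the bandwidth is often set from the data (for example, the median pairwise distance), which this sketch does not attempt.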

brand name, creativity, project title, (17 more...)

arXiv.org Artificial Intelligence

2404.00017

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > New York > Tompkins County > Ithaca (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry: Law > Intellectual Property & Technology Law (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.67)

A Tale of Tails: Model Collapse as a Change of Scaling Laws

Dohmatob, Elvis, Feng, Yunzhen, Yang, Pu, Charton, Francois, Kempe, Julia

arXiv.org Artificial Intelligence | Feb-10-2024

As AI model size grows, neural scaling laws have become a crucial tool to predict the improvements of large models when increasing capacity and the size of original (human or natural) training data. Yet, the widespread use of popular models means that the ecosystem of online data and text will co-evolve to progressively contain increased amounts of synthesized data. In this paper we ask: How will the scaling laws change in the inevitable regime where synthetic data makes its way into the training corpus? Will future models still improve, or be doomed to degenerate up to total (model) collapse? We develop a theoretical framework of model collapse through the lens of scaling laws. We discover a wide range of decay phenomena, analyzing loss of scaling, shifted scaling with number of generations, the "un-learning" of skills, and grokking when mixing human and synthesized data. Our theory is validated by large-scale experiments with a transformer on an arithmetic task and text generation using the large language model Llama2.

model collapse, scaling law, synthesized data, (16 more...)

arXiv.org Artificial Intelligence

2402.07043

Country:

North America > United States > New York (0.04)
North America > Dominican Republic (0.04)
North America > Canada > Ontario > Toronto (0.04)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.45)

The perpetual motion machine of AI-generated data and the distraction of ChatGPT-as-scientist

Listgarten, Jennifer

arXiv.org Artificial Intelligence | Nov-29-2023

Since ChatGPT works so well, are we on the cusp of solving science with AI? Is not AlphaFold2 suggestive that the potential of LLMs in biology and the sciences more broadly is limitless? Can we use AI itself to bridge the lack of data in the sciences in order to then train an AI? Herein we present a discussion of these topics.

information, perpetual motion machine, synthetic data, (15 more...)

arXiv.org Artificial Intelligence

2312.00818

Country: North America > United States > California > Alameda County > Berkeley (0.04)

Genre: Research Report (0.41)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)

ChatGPT generates fake data set to support scientific hypothesis

Nature | Nov-22-2023

The artificial-intelligence model that powers ChatGPT can create superficially plausible scientific data sets. Researchers have used the technology behind the artificial intelligence (AI) chatbot ChatGPT to create a fake clinical-trial data set to support an unverified scientific claim. In a paper published in JAMA Ophthalmology on 9 November, the authors used GPT-4 -- the latest version of the large language model on which ChatGPT runs -- paired with Advanced Data Analysis (ADA), a model that incorporates the programming language Python and can perform statistical analysis and create data visualizations. The AI-generated data compared the outcomes of two surgical procedures and indicated -- wrongly -- that one treatment is better than the other. "Our aim was to highlight that, in a few minutes, you can create a data set that is not supported by real original data, and it is also opposite or in the other direction compared to the evidence that are available," says study co-author Giuseppe Giannaccare, an eye surgeon at the University of Cagliari in Italy. The ability of AI to fabricate convincing data adds to concern among researchers and journal editors about research integrity.

artificial intelligence, machine learning, natural language, (14 more...)

Nature

AI-Alerts: 2023 > 2023-11 > AAAI AI-Alert for Nov 28, 2023 (1.00)

Country:

Europe > Italy > Sardinia > Cagliari > Cagliari (0.25)
North America > United States > California > San Francisco County > San Francisco (0.15)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.90)

Industry:

Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (0.89)
Health & Medicine > Pharmaceuticals & Biotechnology (0.57)
Health & Medicine > Surgery (0.55)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

AI Is an Existential Threat to Itself

The Atlantic - Technology | Jun-21-2023, 20:48:32 GMT

In the beginning, the chatbots and their ilk fed on the human-made internet. Various generative-AI models of the sort that power ChatGPT got their start by devouring data from sites including Wikipedia, Getty, and Scribd. They consumed text, images, and other content, learning through algorithmic digestion their flavors and texture, which ingredients go well together and which do not, in order to concoct their own art and writing. Generative AI is utterly reliant on the sustenance it gets from the web: Computers mime intelligence by processing almost unfathomable amounts of data and deriving patterns from them. ChatGPT can write a passable high-school essay because it has read libraries' worth of digitized books and articles, while DALL-E 2 can produce Picasso-esque images because it has analyzed something like the entire trajectory of art history.

chatgpt, model collapse, tailed jackrabbit, (14 more...)

The Atlantic - Technology

Country:

North America > Canada > Ontario > Toronto (0.15)
North America > United States > Texas > Travis County > Austin (0.05)
North America > United States > California (0.05)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.05)

Genre: Research Report (0.71)

Industry:

Education (0.35)
Health & Medicine (0.30)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

The top 5 open-source tools for visualizing AI-generated data

#artificialintelligence | Oct-29-2020, 02:30:03 GMT

The ability to build artificial intelligence (AI) or machine-learning (ML) models is moving quickly away from the data scientist's domain and toward the citizen developer. Creating results from AI is getting easier, thanks to open-source tools that can convert AI/ML data streams into clear information that drives visualizations. It's essential to visualize AI and ML data in a way that helps you draw insights and find trends and patterns. The quality and quantity of the data available to you are critical factors. A visual representation should have some basic features.

artificial intelligence, machine learning, visualization, (17 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.55)
